Bibliography

[1] Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, and Chenlei Guo. Knowledge distillation from internal representations. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 7350–7357, 2020.

[2] Milad Alizadeh, Javier Fernández-Marqués, Nicholas D Lane, and Yarin Gal. An empirical study of binary neural networks’ optimisation. In Proceedings of the International Conference on Learning Representations, 2018.

[3] Zeyuan Allen-Zhu and Yuanzhi Li. Towards understanding ensemble, knowledge distillation and self-distillation in deep learning. arXiv preprint arXiv:2012.09816, 2020.

[4] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein generative adversarial networks. In Proceedings of the International Conference on Machine Learning, pages 214–223, 2017.

[5] Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang, Irwin King, and Michael R Lyu. Towards efficient post-training quantization of pre-trained language models. arXiv preprint arXiv:2109.15082, 2021.

[6] Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, and Irwin King. BinaryBERT: Pushing the limit of BERT quantization. arXiv preprint arXiv:2012.15701, 2020.

[7] Slawomir Bak, Peter Carr, and Jean-Francois Lalonde. Domain adaptation through synthesis for unsupervised person re-identification. In Proceedings of the European Conference on Computer Vision, pages 189–205, 2018.

[8] Ron Banner, Itay Hubara, Elad Hoffer, and Daniel Soudry. Scalable methods for 8-bit training of neural networks. In Advances in Neural Information Processing Systems, volume 31, 2018.

[9] Yoshua Bengio, Nicholas Léonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.

[10] Joseph Bethge, Christian Bartz, Haojin Yang, Ying Chen, and Christoph Meinel. MeliusNet: Can binary neural networks achieve MobileNet-level accuracy? arXiv preprint arXiv:2001.05936, 2020.

[11] Joseph Bethge, Marvin Bornstein, Adrian Loy, Haojin Yang, and Christoph Meinel. Training competitive binary neural networks from scratch. arXiv preprint arXiv:1812.01965, 2018.

[12] Joseph Bethge, Haojin Yang, Marvin Bornstein, and Christoph Meinel. BinaryDenseNet: Developing an architecture for binary neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2019.
